Discovery and characterization of novel DNA viruses in Apis mellifera: expanding the honey bee virome through metagenomic analysis

ABSTRACT To date, many viruses have been discovered to infect honey bees. In this study, we used high-throughput sequencing to expand the known virome of the honey bee, Apis mellifera, by identifying several novel DNA viruses. While the majority of previously identified bee viruses are RNA viruses, our study reveals nine new genomes from the Parvoviridae family, tentatively named Bee densoviruses 1 to 9. In addition, we characterized a large DNA virus, Apis mellifera filamentous-like virus (AmFLV), which shares limited protein identities with the known Apis mellifera filamentous virus. The complete sequence of AmFLV, obtained by a combination of laboratory techniques and bioinformatics, spans 152,678 bp. Its linear dsDNA genome encodes 112 proteins, of which 49 are annotated. Another large virus we discovered is Apis mellifera nudivirus, which belongs to the genus Alphanudivirus. It has a circular dsDNA genome of 129,467 bp with 106 protein-encoding genes. The virus contains most of the core genes of the family Nudiviridae. This research demonstrates the effectiveness of viral binning in identifying viruses in honey bee virology, showcasing its initial application in this field.

IMPORTANCE Honey bees contribute significantly to food security by providing pollination services. Understanding the virome of honey bees is crucial for the health and conservation of bee populations and also for the stability of the ecosystems and economies for which they are indispensable. This study unveils previously unknown DNA viruses in the honey bee virome, expanding our knowledge of potential threats to bee health. The viral binning approach we employed in this study offers a promising method for uncovering and understanding the vast viral diversity in these essential pollinators.

The set of all viruses found in a given environment is called a virome. In bees, the virome consists of bee-infecting viruses, viruses infecting other eukaryotes living in or on bees (viruses of parasites), bacteriophages, and transient viruses present in pollinator resources such as pollen (12). Several new bee viruses have been discovered using next-generation sequencing (NGS) techniques (13-15). Thanks to NGS, approximately 72 viruses have been described in honey bees (16). Most of the viruses found in honey bees belong to the +ssRNA group, such as members of the Iflaviridae and Dicistroviridae. So far, only a few DNA viruses have been identified: the large dsDNA virus Apis mellifera filamentous virus (AmFV), discovered in 1978 (17) but fully sequenced only in 2015 (17,18), and the recently discovered small single-stranded DNA viruses belonging to the families Genomoviridae and Microviridae (19).
In our recent project focusing on the honey bee virome, we found sequences of bee DNA viruses belonging to the Parvoviridae family, a large DNA virus related to AmFV, and another related to Nudiviridae. Parvoviruses, especially those belonging to the subfamily Densovirinae, are well-known small nonenveloped viruses that infect insects (20). The genome of parvoviruses consists of linear ssDNA, 4-6 kb long, which contains two major expression cassettes with open reading frames (ORFs) that encode non-structural proteins and structural capsid proteins (21,22). The known members of this family are highly pathogenic for their insect hosts (20). The other sequences we have identified are similar to AmFV, a large dsDNA virus with a genome of over 450 kb. This virus has not yet been classified, and some of its proteins show identities with proteins of the family Baculoviridae (18). Individuals with high viremia have milky hemolymph due to cellular degradation caused by the presence of virions and show signs of weakness, such as crawling at the hive entrance. Although the virus is widespread in colonies in different parts of the world (23,24), clinical symptoms are detected only sporadically (17,25). Nudiviridae are also a group of viruses that infect insects and crustaceans; they are enveloped dsDNA viruses with a large genome of 90-230 kb (26).
In this study, we describe for the first time bee parvoviruses, Apis mellifera filamentous-like virus (AmFLV), and Apis mellifera nudivirus (AmNV). Laboratory and bioinformatic approaches were combined to complete the genomes of these new honey bee viruses. To our knowledge, viral binning and the creation of vMAGs (viral metagenome-assembled genomes) were used here for the first time for the genome assembly of honey bee viruses.

Honey bee collection, sample processing, and libraries preparation
Bees were collected from 18 hives in five different locations in Czechia [Lisnice, Libechov, Prasily, Brdy/Nerezin, and Prague-Ruzyne in the Crop Research Institute (abbreviated as VURV)]. In addition, two of the five VURV colonies used for analyses were moved from a different site, out of flying range, to a demarcated place in VURV two weeks before sampling. Bees were sampled by shaking them from a brood comb frame into a plastic bag and were immediately placed in a polystyrene box on dry ice. Then, the bees were divided into sterile centrifuge tubes and stored at −80°C until further use. Except for two colonies, which had obvious signs of varroosis (i.e., crippled wings), all hives were considered "healthy," that is, the colonies had rapid build-up, showed no damaged capping, and no signs of overt diseases or Varroa infestation were observed. The list and specifications of the samples are available on GitHub (https://github.com/kadlck/NAZV19).
Fifty randomly selected bees from each hive were pooled in one sample. As we suggested previously (24), this number of bees in the sample should allow for capturing the full diversity of eukaryotic viruses in the colonies. We processed the bees as described in detail before (24). Briefly, the homogenization was done in four 5 mL tubes with ceramic beads (Bertin Technologies, Montigny-le-Bretonneux, Ile-de-France, France); after centrifugation and filtration, the homogenates were pooled into two separate aliquots of 25-bee pools; and after extraction of encapsulated nucleic acids with the QIAamp Viral RNA Kit (Qiagen, Hilden, Germany), the two 25-bee pools were combined back into one sample of 50 bees. The reverse transcription of RNA and amplification of cDNA/DNA was done with the WTA2 kit (Sigma-Aldrich, St. Louis, Missouri, United States). Libraries were prepared with Nextera XT (Illumina, San Diego, California, USA). The sequencing was performed in several runs on a NextSeq 500 using Mid Output Kits (Illumina, San Diego, California, USA) for 2 × 150 bp paired-end cycles, with a minimum of 10 M reads per sample. 1× PBS was used as a negative control and was processed through all the steps (from homogenization to sequencing) to exclude possible contamination of samples and reagents during processing (see Fig. 1 for an overview).
We considered removing the host reads from the samples prior to analysis; they made up ~36% of all reads on average. When we compared the assemblies of dehosted and non-dehosted reads, the scaffold statistics very slightly favored the non-dehosted option for most of the samples. Therefore, we continued with reads that included the host reads (Table 1; full statistics on GitHub).

Bioinformatics: the creation of MAGs
Since we found contigs belonging to large viruses, we tried binning and creating vMAGs (Fig. 1). Several different steps and software tools were tried on test samples, and the best-performing tools were used further. First, we predicted viral contigs from the scaffolds (>1,500 bp) using Virify v0.4.0 (33) and geNomad v1.2.0 (https://github.com/apcamargo/genomad). Trimmed reads were mapped back to the viral contigs, which were then binned using vRhyme v1.1.0 (34). The viral contigs were also checked with CheckV v1.0.1 (31), whose output was used for dRep v3.4.0 (35), removing bins with less than 50% completeness. The complete genomes (ITR/DTR/circular genomes) identified by vRhyme (34) and CheckV v1.0.1 (31) were extracted and added to the bins. Bins were checked for mismatched classifications [Diamond (29) against IMG/VR v4 (36) and NR, geNomad classification] and split if necessary. They were further split according to the classification that we could assign with certainty, considering the percent identity and CheckV statistics (contamination/warnings). Furthermore, the high-quality vMAGs/genomes from all samples were gathered, and CheckV v1.0.1 (31) with dRep v3.4.0 (35) was run again. This resulted in a non-redundant set of vMAGs that were at least 50% complete. They were classified against IMG/VR v4 (36) and the NR protein database (NCBI). All suspicious vMAGs (based on CheckV statistics and warnings, or with uncertain or disputable classification), with a focus on eukaryotic viruses, were checked again over the whole length of the sequence against NR in BLAST at NCBI and manually curated. Most of the cases with difficult assignment or splitting were phages.
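The two numeric thresholds named above (scaffolds longer than 1,500 bp and at least 50% completeness) can be sketched as simple filters. This is only an illustration on in-memory records: the `Contig` class, its field names, and the example values are hypothetical and do not reproduce the file formats of CheckV or dRep.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    length: int          # bp
    completeness: float  # percent, e.g., an estimate such as CheckV reports


def long_enough(contigs, min_length=1500):
    """Keep scaffolds strictly longer than min_length (the >1,500 bp cutoff)."""
    return [c for c in contigs if c.length > min_length]


def complete_enough(contigs, min_completeness=50.0):
    """Keep records that are at least min_completeness percent complete."""
    return [c for c in contigs if c.completeness >= min_completeness]


# Hypothetical records, for illustration only.
records = [
    Contig("scaffold_a", 1500, 90.0),
    Contig("scaffold_b", 1501, 40.0),
    Contig("scaffold_c", 5000, 50.0),
]
kept = complete_enough(long_enough(records))
print([c.name for c in kept])
```

Keeping the thresholds as named parameters makes it easy to rerun the selection with stricter or looser cutoffs on test samples.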

Completing vMAG with additional sequencing
For the processing of AmFLV, the sequences of the three largest contigs were extended by amplicon sequencing. Primers were designed at both ends of the three contigs (available on GitHub, see Data Availability for link), and PCR was performed with primer combinations using Phusion polymerase (Thermo Fisher, Waltham, Massachusetts, USA) according to the manufacturer's protocol with three different extension times (2.5, 5, and 10 min). The resulting amplicons were purified using MSB SPIN PCRAPACE (Invitek Molecular, Berlin, Germany). The concentrations were measured using the Qubit dsDNA HS Assay kit (Thermo Fisher, Waltham, Massachusetts, USA). The library was prepared using the Nextera XT kit (see NetoVIR protocol) with 5-min tagmentation, sequenced, and analyzed as described above.

Determination of the complete nudivirus genome
First, we identified a few relatively short contigs that could be attributed to nudiviruses; the longest contig (16,645 bp) was obtained from one sample (Lisnice11). To enhance the detection of DNA, the sample was sequenced again without WTA2 preamplification, which includes reverse transcription. One sample (Lisnice24) contained more than 50,000 reads attributable to Nudiviridae contigs. All reads obtained from the three samples were co-assembled with SPAdes using the settings described above. Then, we mapped the individual samples on the scaffolds, used vRhyme to obtain bins, and extended them from the assembly graph with binSPReader, pre-release (37). The extended nudivirus bin was checked with VirSorter2 v2.2.4 (38) and DRAM v1.4.6 (39) to ensure that all contigs we had gained had a majority of Nudiviridae alignments. Then, we tried to obtain cleaner reads to resolve some regions. We mapped the reads of these three samples on the bin of four contigs undoubtedly belonging to Nudiviridae and extracted the mapped reads (together with the mate when only one read of a pair mapped). We repeated the mapping on the scaffolds (>1 kbp) gained in the co-assembly and extracted the unmapped reads. The resulting reads thus came from the three samples, mapped on the nudivirus contigs, and did not map on anything else (scaffolds from the co-assembly of the three samples, greater than 1,000 bp). With these reads, we tried several assemblies; the best was obtained with SPAdes v3.15.5 with the --metaviral option and k-mer sizes of 21, 33, 55, and 77, which resulted in three contigs of 76,482, 24,683, and 22,709 bp (123,874 bp in total). Therefore, we designed primers similarly as for AmFLV (available on GitHub, see Data Availability for link), at the ends of the contigs and aiming outward. We performed PCR with different combinations of primers with Phusion polymerase and different elongation times. The obtained fragments were purified, and the library was prepared using Nextera XT (tagmentation times based on the lengths of the amplicons: 2, 3, and 7 min) and sequenced on a MiSeq with 2 × 250 bp reads.
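The idea of primers placed at contig ends and aiming outward can be sketched as follows. This is a minimal illustration of the orientation logic only: the fixed primer length, the function names, and the toy sequence are assumptions, and real primer design (melting temperature, specificity, secondary structure) is not modeled. The actual primers used are those listed on GitHub.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def outward_primers(contig, primer_len=20):
    """Return (left_primer, right_primer), both written 5'->3'.

    right_primer copies the last primer_len bases, so it extends
    past the 3' end of the contig. left_primer is the reverse
    complement of the first primer_len bases, so it extends past
    the 5' end of the contig.
    """
    if len(contig) < primer_len:
        raise ValueError("contig shorter than primer length")
    left = reverse_complement(contig[:primer_len])
    right = contig[-primer_len:]
    return left, right


# Toy sequence, for illustration only.
left, right = outward_primers("ACGTTGCAAGGCCTTA", primer_len=5)
print(left, right)

# Sanity check of the contig lengths reported in the text.
print(sum([76482, 24683, 22709]))
```

A product obtained with the right primer of one contig and the left primer of another indicates that the two ends are adjacent in the genome, which is how the gaps between contigs can be closed.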

Comparison with published data sets
After finding possible new viruses in our de novo data, we analyzed sequences obtained by NGS from studies that used a sample-processing protocol allowing detection of both RNA and DNA viruses (40). FastQ files obtained in the study by Deboutte et al. (41) were pulled from the Sequence Read Archive (SRA) using prefetch and fastq-dump available from NCBI, and all non-redundant scaffolds from all the samples were made available on GitHub. Additionally, we used data from our previous study (24). The non-redundant scaffolds from both studies were classified with Diamond blastx against NR with --sensitive and -c 1.
The viral set we created for further analyses consisted of novel viral genomes detected in the current study combined with viruses found in the previous studies (24,41) as described above. The viral set contains all the vMAGs/genomes. Reads from the studies were mapped on them using bwa-mem2 v2.2.1 (32), and the coverage was extracted using CoverM v0.6.1 (https://github.com/wwood/CoverM).
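The retrieval and classification steps can be sketched as command lines assembled in Python. Only prefetch, fastq-dump, and Diamond blastx with --sensitive and -c 1 are named in the text; the accession, file names, database path, and the use of --split-files are placeholders and assumptions, and the sketch only builds the commands rather than running them.

```python
def sra_commands(accession):
    """Build prefetch and fastq-dump command lines for one SRA run."""
    return [
        ["prefetch", accession],
        # --split-files writes paired reads to separate files (an assumption here).
        ["fastq-dump", "--split-files", accession],
    ]


def diamond_blastx_command(query_fasta, database, output_tsv):
    """Build a Diamond blastx command with the options stated in the text."""
    return [
        "diamond", "blastx",
        "--db", database,
        "--query", query_fasta,
        "--out", output_tsv,
        "--sensitive",
        "-c", "1",
    ]


# Placeholder names, for illustration only.
for cmd in sra_commands("SRR0000000"):
    print(" ".join(cmd))
print(" ".join(diamond_blastx_command("scaffolds.fasta", "nr", "hits.tsv")))
```

Passing such lists to `subprocess.run(cmd, check=True)` would execute them, provided the tools are installed and the placeholders are replaced with real accessions and paths.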

Sequencing statistics
The sequencing statistics are listed in Table 1. We obtained more than 10 M reads per sample. A large number of reads also mapped back to our curated vMAGs, showing the efficiency of our protocol.

Parvoviridae
NGS of the 18 analyzed samples resulted in 77,423 reads that belong to the Parvoviridae family. This represents approximately 0.02% of all reads and 0.03% of those determined to be viral reads. These reads were present in 7 out of 18 samples. In the analyzed sequences from the three studies from which we gained contigs, we found 16 unique contigs corresponding to the subfamily Densovirinae: nine were complete and seven were incomplete genomes. The complete genomes were named Bee densoviruses 1 to 9. The lengths of the complete genomes ranged from 3.6 to 6 kbp. All complete genomes contained two major ORFs that encode structural and non-structural proteins with Parvoviridae-specific motifs, and other short ORFs encoding additional hypothetical proteins, surrounded by non-coding regions at both ends (Table 2; Fig. 2). The number of proteins and their predicted function based on similarity are shown in Table 3.
Terminal repeats (TRs) were present in five of the nine complete genomes (Bee densoviruses 1, 2, 3, 6, and 9). Detailed information and the predicted structure of the TRs are shown in Table 2; Fig. 2 in the file with additional densovirus figures at https://github.com/kadlck/NAZV19. The TRs were completely identical for Bee densoviruses 2 and 6 or had 1-2 bp mismatches for the other Bee densoviruses. The rest of the complete genomes lack terminal repetition; however, the ORFs are flanked by non-coding regions ranging in length from 14 bp (Bee densovirus 4) to 414 bp (Bee densovirus 7) at the 5′ end of the genomes and from 104 bp (Bee densovirus 4) to 229 bp (Bee densovirus 5) at the 3′ end.
Incomplete genomes (3.7-4.5 kbp) contain neither non-coding regions nor complete ORFs. They were therefore deemed incomplete and are available on GitHub (see Materials and Methods for link).
The bee densoviral genomes were highly variable; they differed in length, terminal repetitions, and predicted ORFs. The sequence similarity of the genomes ranged from 24% to 44% across the whole genome (Table 2 in the file with additional densovirus figures at https://github.com/kadlck/NAZV19). This variability was confirmed by phylogenetic analysis using sequences retrieved from NCBI (Fig. 3 in the file with additional densovirus figures at https://github.com/kadlck/NAZV19). The genomes were distributed through the phylogenetic tree based on NS1-superfamily region-conserved sequences of densoviruses retrieved from NCBI. Bee densovirus 4 was closest to a member of Miniambidensovirus, Acheta domestica mini ambidensovirus (37.0% similarity over the whole sequence), and Bee densovirus 7 to Scindoambidensovirus but also to members from the family Densoviridae (previously subfamily Ambidensovirus) that lack recent classification. Bee densovirus 8 was closest to sequences belonging to Atrato Denso-like viruses and Broome densovirus (31.2%, 30.3%, 31.2%). The group that included Bee densoviruses 1 and 5 had the highest similarity to Ambidensovirus sp., 40.6% for Bee densovirus 1 and 45.6% for Bee densovirus 5. Bee densovirus 3 was closest to Tarsiger cyanurus ambidensovirus, with a similarity of 36.6% over the whole sequence. Bat-associated densovirus had a 50.6% similarity to Bee densovirus 7. Bee densovirus 6 had the highest similarity, 75.7%, to Periparus ater ambidensovirus. For Bee densovirus 2, the similarity to Phylloscopus inornatus ambidensovirus was 59.9%. Finally, Bee densovirus 9 had a similarity of 46.7% to Ambidensovirus sp.
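One way to screen a linear genome for inverted terminal repeats, including repeats with a small number of mismatches, is sketched below. This is an illustrative check under stated assumptions, not the procedure used for the genomes above: it assumes the repeat is inverted (the 5′ end matches the reverse complement of the 3′ end), and the function names, the 500 bp search limit, and the toy sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_terminal_repeat_length(genome, max_len=500, max_mismatches=0):
    """Length of the longest terminal stretch whose 5' end matches the
    reverse complement of the 3' end with at most max_mismatches."""
    rc = reverse_complement(genome)
    limit = min(max_len, len(genome) // 2)
    best = 0
    mismatches = 0
    for n in range(1, limit + 1):
        # Position n-1 of rc is the complement of the n-th base from the 3' end.
        if genome[n - 1] != rc[n - 1]:
            mismatches += 1
        if mismatches > max_mismatches:
            break
        best = n
    return best


# Toy genome: a 5 bp inverted repeat around an unrelated core.
toy = "ACGGT" + "TTTTTTTT" + reverse_complement("ACGGT")
print(inverted_terminal_repeat_length(toy))
```

Raising `max_mismatches` to 1 or 2 would tolerate the 1-2 bp differences reported for some of the genomes, at the cost of occasionally extending the reported length by a trailing mismatched base.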

Apis mellifera filamentous-like virus (AmFLV)
From de novo sequencing, we identified one contig 103,615 bp long with low similarity to AmFV polymerase (30.8%). The retrieved sequence was extended as a vMAG with five more contigs (77,565 bp, 43,940 bp, 8,006 bp, 5,756 bp, and 4,443 bp), and CheckV predicted the vMAG as 92.9% complete. A ~10 kb amplicon was obtained using PCR with specific primers designed at the ends of the large contigs (see GitHub, https://github.com/kadlck/NAZV19, for the scheme). After sequencing the amplicon, assembly, and polishing, a linear genome with TRs was obtained. Out of 39,950,966 reads, 448,034 reads (1.1% of the sample where the contigs were discovered, region Brdy, specifically Brdy1) mapped to the complete genome. The complete genome of the virus, provisionally named AmF-like virus, was 152,678 bp long with a GC content of 49.8%. A scheme of the genome is shown in Fig. 3A. The sequence of AmFLV was flanked by inverted homologous TRs 77 bp long, forming a Y-shape at both ends (Fig. 3B). A total of 112 ORFs were identified; the putative ORFs were distributed on both strands, and 49 (43.8%) were identified in protein databases (Viral, Peptidase, Pfam, CAZy, VOGDB, KEGG).
The predicted proteins mostly showed similarity to AmFV proteins (from 16.8% to 51.5%), but also to proteins of other large viruses (see GitHub, https://github.com/kadlck/NAZV19, for all hits): AmFLV_89 to the trimeric dUTPase of Vombatid gammaherpesvirus 1 (55.1%), AmFLV_76 to the orf66 gene product of Helicoverpa zea nudivirus 2 (58.4%), AmFLV_2 to the inhibitor of apoptosis (iap) of Trichoplusia ni single nucleopolyhedrovirus (43.0%), AmFLV_34 to ribonucleotide reductase/HP APL35_gp114 of AmFV (50.4%), AmFLV_74 to the putative endonuclease of Emiliania huxleyi virus 86 (42.9%), and AmFLV_87 to the DNA polymerase of AmFV (28.4%). Like other large viruses, this virus encodes its own DNA polymerase and proteins such as dUTPase and metalloproteinase that can affect host cell metabolism. Additionally, this virus encodes proteins such as inhibitors of apoptosis and per os infectivity factors. On average, the identity of proteins with known function was higher than that of hypothetical proteins, and even higher for proteins that affect cell metabolism. However, even with several significant alignments (mostly to AmFV), most of the proteins are hypothetical or have no significant alignments detected. Some characteristics of the new virus are similar to AmFV, such as the presence of the Baculoviridae-related genes pif-1, pif-2, and pif-3, which are important for cell entry, are essential for per os infection, and form the conserved per os infectivity complex in Baculoviridae (52); the similarities were relatively high (41.7%, 51.5%, and 41.7%). Another shared feature is the presence of the kinesin motor domain, which could be one of the components responsible for affecting cytoskeletal dynamics during viral infection. It may be important that the virus has significant similarity to one hypothetical protein of AmFV (APL35_gp042, 42.4%), which could encode an integrase/recombinase closest to the phage integrase family (Pfam, VOGDB). The identified AmFLV virus was also detected in our previous study (24) and in the study by Deboutte et al. (41) (Table 4).
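The percentages reported for AmFLV follow directly from the stated counts, as the short check below shows. The helper names are hypothetical, and the GC function is included only to illustrate how a GC content such as 49.8% is defined; it is applied here to toy strings, not to the actual genome.

```python
def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return percent(seq.count("G") + seq.count("C"), len(seq))


# Reads mapped to the complete AmFLV genome in sample Brdy1.
print(round(percent(448_034, 39_950_966), 2))  # share of mapped reads, in %

# ORFs with a hit in the protein databases.
print(round(percent(49, 112), 2))  # share of annotated ORFs, in %

# Toy sequence, for illustration only.
print(gc_content("ATGC"))
```

Both values agree with the rounded figures given in the text (1.1% of reads and 43.8% of ORFs).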
The phylogenetic analysis was done with representatives of large viruses (from NCBI, a random selection of RefSeq DNA polymerase sequences in each family) and shows that AmFV, based on polymerase sequences, is the closest representative but is still clearly distinct (Fig. 3C).
The vMAG was made up mainly of three large contigs (103, 77, and 44 kbp), but only the 103 and 44 kbp contigs mapped to the completed AmFLV genome. All three contigs have very similar mapping patterns (Table 4) across the samples where the contigs were present, so the inclusion of the 77 kbp contig in the vMAG is not unexpected. The sequence of the 77 kbp contig (made available on GitHub, https://github.com/kadlck/NAZV19) probably belongs to another large virus that infects honey bees. The likelihood that the 77 kbp contig does not contain AmFLV sequence is supported by two findings: a higher number of reads mapped to the shorter 77 kbp contig than to the 103 kbp contig, and only two similarities to AmFV were detected in the 77 kbp contig in comparison to dozens in the 103 and 44 kbp contigs (Table 4). Additionally, when the 77 kbp contig was included in the vMAG, two ribonucleoside-diphosphate reductase genes were identified.
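The notion of contigs sharing a mapping pattern across samples, and of comparing read counts between contigs of different lengths, can be illustrated with two small helpers. This is a sketch with made-up numbers, not the data in Table 4: the coverage vectors are hypothetical, and Pearson correlation and reads per kbp are used here only as simple stand-ins for how such patterns can be compared.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long coverage profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(vx * vy)


def reads_per_kbp(read_count, length_bp):
    """Normalize a read count by contig length."""
    return read_count / (length_bp / 1000)


# Hypothetical per-sample mean coverage for two contigs of one bin.
contig_a = [0.0, 12.0, 150.0, 3.0]
contig_b = [0.0, 20.0, 260.0, 5.0]
print(round(pearson(contig_a, contig_b), 3))
```

A high correlation explains why a binning tool would group two contigs, while a clearly different depth after length normalization is one hint that they may still come from different genomes.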

Apis mellifera nudivirus (AmNV)
At the beginning, we had 12 contigs, which we reduced by co-assembly (see Materials and Methods) to three contigs of 76,482, 24,683, and 22,709 bp (123,874 bp in total) with high reliability and continuity. They all had a number of hits to Nudiviridae. With primers that we designed at the ends of these contigs and directed outward, we were able to obtain by PCR three amplicons (<500 bp, <900 bp, <10 kbp) in reasonable combinations, which make the virus circular and complete (see GitHub, https://github.com/kadlck/NAZV19, for the scheme). These amplicons were sequenced, and the genome of the new nudivirus found in honey bees was completed. We gained a circular genome of 129,467 bp, coding 106 proteins on both strands of the viral genome. The virus had a GC content of 40.3%, was named Apis mellifera nudivirus, and taxonomically belongs to Alphanudivirus (Fig. 4). A total of 219,147 reads out of 31,076,599 (0.7%) mapped to the virus. The number of reads mapping is shown.
The virus contains a complete core of per os infectivity complex (pif-1/2/3) and all genes we expect to see in a member of Nudiviridae (core genes), mainly DNA polymerase B, integrase/recombinase, DNA-directed RNA polymerase, and DNA helicase 2. Apart from Nudiviridae proteins, AmNV has other significant alignments, which might be specific for honey bees (like alignment to APL35_gp193 of AmFV).

DISCUSSION
Historically, honey bees have been thought to be primarily associated with a plethora of RNA viruses belonging to the order Picornavirales, with members of Iflaviridae and Dicistroviridae being the most common. In stark contrast, only a handful of DNA viruses have been documented in this managed pollinator (18,19). Our analysis of 18 NGS samples, each representing a pool of 50 bees, together with sequences from two other studies (24,41), has significantly expanded the landscape of DNA viruses that may be present in different honey bee populations. In our previous article, we suggested pooling a larger number of bees per sample, because this approach should allow detection of the diversity of honey bee viruses in the colony, including low-prevalence viruses (i.e., those present in only part of the bees in a colony). Even if the viruses are in low abundance in the pooled sample, their reads can be used for de novo genome assembly (24). De novo assembly and the subsequent generation of vMAGs from the sequences performed in this study revealed several new DNA viruses. The benefit of analyzing larger numbers of bees is documented by the detection of low-prevalence viruses, and the approach proves to be beneficial in reflecting the diversity of DNA viruses in honey bees.
Most of the new DNA viruses could be classified as members of the Densovirinae subfamily. The densoviruses have a classical structure, with two coding ORF cassettes and untranslated regions at the ends; five out of nine have terminal repeats, while the others lack them. Not all known members of Densovirinae have terminal repetition at their genome ends (50). Our phylogenetic tree was based on the conserved domain of NS1. Although most described genera of Densovirinae clustered together, the group previously classified as Ambidensovirus, which encompasses all the new -ambidensovirus genera, was distributed through the tree with low bootstrap support. Members of the group are known to have even less than 30% similarity within the genus (22). The latest revision of the Parvoviridae taxonomy split the family into seven more lineages, some of which have greater similarity to Iteradensovirus than to other members of the group previously classified as Ambidensovirus (22).
The impact of Densovirinae on arthropods varies, ranging from overt pathologies (20,54) to mutualistic relationships (55). Most members of the subfamily Densovirinae cause lethal infection of their hosts. The first symptoms are often anorexia and lethargy, followed by flaccidity, progressive paralysis, slow melanization, and tumor development (20). With a well-defined and distinguishable set of symptoms and a high viral load in a sick or dead animal, these viruses were relatively easy to identify before the era of NGS. The bees that were selected for our analysis showed no overt signs of infection, such as deformities. Determining whether the newly described viruses are truly asymptomatic will require further study. The large number of densoviruses found with a low degree of nucleic acid similarity (23.9% to 44.4%) in two European countries (Czech Republic, Belgium) may indicate the presence of these and similar viruses in other countries as well. The diversity of Densovirinae seems to be steadily increasing, and new genera are being identified within the viruses previously known as ambidensoviruses. This high variability may not be limited to honey bee viruses such as the Densovirinae. For instance, a recent study identified nine complete and six incomplete new species of chaphamaparvoviruses in six chickens (56).
After the initial analysis of the obtained sequences, we found a long fragment with low identities to AmFV. With viral binning, we found possible other assembly fragments of the new DNA virus. The AmFLV genome was completed using PCR that connected two contigs into one linear viral genome with TRs and a length of over 152 kbp. The terminal repeats form Y-structures at the ends of the genome. The genome contains a wide range of ORFs encoding proteins from polymerase to those that affect the host cytoskeleton. Notably, the newly described AmFLV contains all three pifs (1, 2, 3), which have been documented as the core of the per os infectivity complex in baculoviruses (52). Apart from that, the homology of the viral sequence with the phage integrase family and a probable integrase/recombinase might be of interest. This protein is also present in AmFV but is denoted as hypothetical. It is possible that, under certain conditions, the virus is able to utilize this protein for integration into the host genome. The protein has been described in some large viruses, for example, in Nudiviridae. Therefore, it could be involved in the establishment of latency and integration into the host genome (26). Overall, AmFLV encodes 112 ORFs, but the function of most of them remains undescribed (hypothetical), even for those for which we were able to find similarity, mainly to AmFV.
The remaining long contig has a very similar mapping pattern across all samples where the contigs were present. The contig is approximately 77 kbp in size. However, we did not succeed in obtaining its complete genome. It is also possible that this large fragment represents part of the genome of another large honey bee virus. Similar large DNA viruses with the same mapping patterns could easily be sorted into the vMAG of interest; therefore, the generation of vMAGs is extremely helpful in gaining genomic sequences but should be confirmed by a wet laboratory approach for some uses.
We completed the genome of the honey bee nudivirus using a combination of vMAG generation, co-assembly, and amplicon sequencing. We gained a circular genome 129,467 bp long that contains the core genes of the Nudiviridae. In particular, lef-4, lef-5, lef-8, lef-9, and p47 are important for transcription, whereas pif-1-3, pif-4/19 kDa, and p74 are necessary for infection. There are several proteins important for viral morphogenesis: 38K, vp91, vlf-1, ac81, and vp39. The core genes also include proteins necessary for replication/recombination and repair, such as DNApol-B, helicase, and integrase. There are other important genes involved in nucleotide metabolism and some with unknown function (26). It is not surprising that the virus lacks 4 of the 29 suggested core genes (26) since, until now, only a few nudiviruses have been described and the list of core genes is still changing. The virus encompasses a total of 108 ORFs, with significant alignments of predicted proteins to Nudiviridae, more precisely to Alphanudivirus (36.1% to 68.3%). The virus contains some proteins with reliable alignments to other viruses, such as AmFV, nucleopolyhedrovirus, or entomopoxvirus, but the similarities are low (11.8% to 47.1%). Phylogenetic analysis also revealed that the virus belongs to Alphanudivirus. The pathologies of Nudiviridae vary from virus to virus and between the host's life stages. In insects, they can cause lethargy, weakness, malformations, stunted growth, and reduced longevity or fertility. They can also cause changes in the viscosity and color of the hemolymph (opalescent), or they can concentrate in the abdomen and form a "waxy plug" (26). However, further studies are needed to find out whether a specific pathology linked to AmNV exists. Generally, infection with these viruses is less symptomatic in comparison to Baculoviridae, and they seem to be restricted to certain cell types (26).
With the exception of two colonies showing varroosis symptoms, all other colonies used as source colonies for sampling were considered healthy, i.e., they built up quickly, had no damaged capping, and showed no signs of overt disease or Varroa infestation. Moreover, all the bees we selected for the virome analysis were free of malformation and showed no signs of overt pathology. Nevertheless, it is not possible to completely exclude the possibility that individual bees had signs of pathology that were not obvious and visible when the bees were selected prior to virome analysis. Further studies with different experimental design and sampling are needed to clarify this issue.
Even though the binning of sequencing data is regularly performed for bacterial and eukaryotic species, the binning and generation of MAGs is still a novel method for viruses. Only in recent years has new software designed for binning of viral sequences been released. Tools like CoCoNet (57), vRhyme (34), and VAMB (58) perform better for creating viral MAGs, since they provide a larger number of cleaner bins with fewer misclassified contigs in comparison to other programs that are better suited for creating bacterial and eukaryotic MAGs (34,57,58). This was confirmed by testing on a limited number of samples, for which we tried MetaWRAP (59) and all the viral binning software mentioned above. For our data set, vRhyme (34) performed best, but it would be beneficial to have a bin refinement tool such as that in MetaWRAP (59) that combines different binning software for viruses. The vMAGs generated were instrumental in piecing together a significant portion of new large viral genomes, which otherwise might have been overlooked, since their similarities to known large viruses affect the statistics (e.g., in CheckV) and they are deemed very incomplete. The generation of vMAGs then allows for keeping maximum information while still filtering out very incomplete and poor fragments. It allows us to "connect" the contigs and treat them as fragmented genomes.
In conclusion, our study underscores the richness of the honey bee DNA virome, which was previously overshadowed by its RNA counterpart. The plethora of densoviruses identified, coupled with the discovery of AmFLV with its predominantly hypothetical proteins and of the first nudivirus in honey bees (AmNV), paves the way for deeper investigations into the ecological and pathological implications of these viruses in bee populations.

FIG 2 Predicted structures of terminal repeats from the described densoviruses (RNA-fold program) as in reference (50). Familiar structures such as the Y-shape or I-shape can be seen. The color scale of base-pair probability is shown.

FIG 4 Information about the new honey bee nudivirus. (A) Scheme of the genome. (B) Taxonomy of core genes. *, proposed genus (53).

TABLE 2 Detailed information about the TRs and ends of the densoviruses. NP, not present.

Research Article, mSystems, April 2024, Volume 9, Issue 4, 10.1128/msystems.00088-24

TABLE 3 BLASTp results for complete densovirus genome proteins. HP, hypothetical protein; NS, non-structural protein; SP, structural protein.

TABLE 4 Mapping of different samples from the analyzed studies on the three contigs belonging to one vMAG and on the whole AmFLV sequence completed by analyses of available NGS sequences and PCR.